---
title: "Reference: createScorer | Evals"
description: Documentation for creating custom scorers in Mastra, allowing users to define their own evaluation logic using either JavaScript functions or LLM-based prompts.
---

# createScorer

Mastra provides a unified `createScorer` factory that allows you to define custom scorers for evaluating input/output pairs. You can use either native JavaScript functions or LLM-based prompt objects for each evaluation step. Custom scorers can be added to Agents and Workflow steps.

## How to Create a Custom Scorer

Use the `createScorer` factory to define your scorer with a name, description, and optional judge configuration. Then chain step methods to build your evaluation pipeline. You must provide at least a `generateScore` step.

```typescript
import { createScorer } from "@mastra/core/evals";

const scorer = createScorer({
  id: "my-custom-scorer",
  name: "My Custom Scorer", // Optional, defaults to id
  description: "Evaluates responses based on custom criteria",
  type: "agent", // Optional: for agent evaluation with automatic typing
  judge: {
    model: myModel,
    instructions: "You are an expert evaluator...",
  },
})
  .preprocess({
    /* step config */
  })
  .analyze({
    /* step config */
  })
  .generateScore(({ run, results }) => {
    // Return a number
  })
  .generateReason({
    /* step config */
  });
```

## createScorer Options

<PropertiesTable
  content={[
    {
      name: "id",
      type: "string",
      isOptional: false,
      description: "Unique identifier for the scorer. Used as the name if `name` is not provided.",
    },
    {
      name: "name",
      type: "string",
      isOptional: true,
      description: "Name of the scorer. Defaults to `id` if not provided.",
    },
    {
      name: "description",
      type: "string",
      isOptional: false,
      description: "Description of what the scorer does.",
    },
    {
      name: "judge",
      type: "object",
      isOptional: true,
      description:
        "Optional judge configuration for LLM-based steps. See Judge Object section below.",
    },
    {
      name: "type",
      type: "string",
      isOptional: true,
      description:
        "Type specification for input/output. Use 'agent' for automatic agent types. For custom types, use the generic approach instead.",
    },
  ]}
/>

This function returns a scorer builder that you can chain step methods onto. See the [MastraScorer reference](./mastra-scorer) for details on the `.run()` method and its input/output.

## Judge Object

<PropertiesTable
  content={[
    {
      name: "model",
      type: "LanguageModel",
      isOptional: false,
      description: "The LLM model instance to use for evaluation.",
    },
    {
      name: "instructions",
      type: "string",
      isOptional: false,
      description: "System prompt/instructions for the LLM.",
    },
  ]}
/>

## Type Safety

You can specify input/output types when creating scorers for better type inference and IntelliSense support:

### Agent Type Shortcut

For evaluating agents, use `type: 'agent'` to automatically get the correct types for agent input/output:

```typescript
import { createScorer } from "@mastra/core/evals";

// Agent scorer with automatic typing
const agentScorer = createScorer({
  id: "agent-response-quality",
  name: "Agent Response Quality",
  description: "Evaluates agent responses",
  type: "agent", // Automatically provides ScorerRunInputForAgent/ScorerRunOutputForAgent
})
  .preprocess(({ run }) => {
    // run.input is automatically typed as ScorerRunInputForAgent
    const userMessage = run.input.inputMessages[0]?.content;
    return { userMessage };
  })
  .generateScore(({ run, results }) => {
    // run.output is automatically typed as ScorerRunOutputForAgent
    const response = run.output[0]?.content;
    return (response?.length ?? 0) > 10 ? 1.0 : 0.5;
  });
```

### Custom Types with Generics

For custom input/output types, use the generic approach:

```typescript
import { createScorer } from "@mastra/core/evals";

type CustomInput = { query: string; context: string[] };
type CustomOutput = { answer: string; confidence: number };

const customScorer = createScorer<CustomInput, CustomOutput>({
  id: "custom-scorer",
  name: "Custom Scorer",
  description: "Evaluates custom data",
}).generateScore(({ run }) => {
  // run.input is typed as CustomInput
  // run.output is typed as CustomOutput
  return run.output.confidence;
});
```

### Built-in Agent Types

- **`ScorerRunInputForAgent`** - Contains `inputMessages`, `rememberedMessages`, `systemMessages`, and `taggedSystemMessages` for agent evaluation
- **`ScorerRunOutputForAgent`** - Array of agent response messages

Using these types provides autocomplete, compile-time validation, and better documentation for your scoring logic.

## Trace Scoring with Agent Types

When you use `type: 'agent'`, your scorer can both be added directly to agents and be used to score traces from agent interactions. The scorer automatically transforms trace data into the proper agent input/output format:

```typescript
import { Mastra } from "@mastra/core";
import { createScorer } from "@mastra/core/evals";

const agentTraceScorer = createScorer({
  id: "agent-trace-length",
  name: "Agent Trace Length",
  description: "Evaluates agent response length",
  type: "agent",
}).generateScore(({ run }) => {
  // Trace data is automatically transformed to agent format
  const userMessages = run.input.inputMessages;
  const agentResponse = run.output[0]?.content;

  // Score 1 if the response is longer than 50 characters, else 0
  return (agentResponse?.length ?? 0) > 50 ? 1 : 0;
});

// Register with Mastra for trace scoring
const mastra = new Mastra({
  scorers: {
    agentTraceScorer,
  },
});
```

## Step Method Signatures

### preprocess

Optional preprocessing step that can extract or transform data before analysis.

**Function Mode:**
Function: `({ run, results }) => any`

<PropertiesTable
  content={[
    {
      name: "run.input",
      type: "any",
      isOptional: false,
      description:
        "Input records provided to the scorer. If the scorer is added to an agent, this will be an array of user messages, e.g. `[{ role: 'user', content: 'hello world' }]`. If the scorer is used in a workflow, this will be the input of the workflow.",
    },
    {
      name: "run.output",
      type: "any",
      isOptional: false,
      description:
        "Output record provided to the scorer. For agents, this is usually the agent's response. For workflows, this is the workflow's output.",
    },
    {
      name: "run.runId",
      type: "string",
      isOptional: false,
      description: "Unique identifier for this scoring run.",
    },
    {
      name: "run.requestContext",
      type: "object",
      isOptional: true,
      description:
        "Request Context from the agent or workflow step being evaluated (optional).",
    },
    {
      name: "results",
      type: "object",
      isOptional: false,
      description: "Empty object (no previous steps).",
    },
  ]}
/>

Returns: `any`  
The method can return any value. The returned value will be available to subsequent steps as `preprocessStepResult`.
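As a minimal sketch, a function-mode preprocess step might split the response text into individual sentences for later analysis. The `Run` shape here is an assumption for illustration, not the library's actual type:

```typescript
// Hypothetical run shape for illustration; a real scorer receives the
// input/output of the agent or workflow step being evaluated.
type Run = { input: unknown; output: { text: string } };

// Split the response text into trimmed sentences so later steps can
// score each one individually.
function preprocessFn({ run }: { run: Run }) {
  const sentences = (run.output.text.match(/[^.!?]+[.!?]/g) ?? []).map((s) =>
    s.trim(),
  );
  return { sentences };
}
```

The returned object would then be available to the analyze and scoring steps as `results.preprocessStepResult`.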

**Prompt Object Mode:**

<PropertiesTable
  content={[
    {
      name: "description",
      type: "string",
      isOptional: false,
      description: "Description of what this preprocessing step does.",
    },
    {
      name: "outputSchema",
      type: "ZodSchema",
      isOptional: false,
      description: "Zod schema for the expected output of the preprocess step.",
    },
    {
      name: "createPrompt",
      type: "function",
      isOptional: false,
      description:
        "Function: ({ run, results }) => string. Returns the prompt for the LLM.",
    },
    {
      name: "judge",
      type: "object",
      isOptional: true,
      description:
        "(Optional) LLM judge for this step (can override main judge). See Judge Object section.",
    },
  ]}
/>
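For example, the `createPrompt` for a claim-extraction preprocess step can be written as a plain function. The `Run` shape and the prompt wording are illustrative assumptions; in prompt object mode you would pair this with a Zod `outputSchema` matching the requested JSON:

```typescript
// Hypothetical run shape for illustration.
type Run = { input: unknown; output: { text: string } };

// Build the extraction prompt sent to the judge LLM.
function createPreprocessPrompt({ run }: { run: Run }): string {
  return [
    "Extract the individual factual claims from the following response.",
    'Return JSON of the form { "claims": string[] }.',
    "",
    `Response: ${run.output.text}`,
  ].join("\n");
}
```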

### analyze

Optional analysis step that processes the input/output and any preprocessed data.

**Function Mode:**
Function: `({ run, results }) => any`

<PropertiesTable
  content={[
    {
      name: "run.input",
      type: "any",
      isOptional: false,
      description:
        "Input records provided to the scorer. If the scorer is added to an agent, this will be an array of user messages, e.g. `[{ role: 'user', content: 'hello world' }]`. If the scorer is used in a workflow, this will be the input of the workflow.",
    },
    {
      name: "run.output",
      type: "any",
      isOptional: false,
      description:
        "Output record provided to the scorer. For agents, this is usually the agent's response. For workflows, this is the workflow's output.",
    },
    {
      name: "run.runId",
      type: "string",
      isOptional: false,
      description: "Unique identifier for this scoring run.",
    },
    {
      name: "run.requestContext",
      type: "object",
      isOptional: true,
      description:
        "Request Context from the agent or workflow step being evaluated (optional).",
    },
    {
      name: "results.preprocessStepResult",
      type: "any",
      isOptional: true,
      description: "Result from preprocess step, if defined (optional).",
    },
  ]}
/>

Returns: `any`  
The method can return any value. The returned value will be available to subsequent steps as `analyzeStepResult`.
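As a sketch, a function-mode analyze step could check preprocessed sentences against terms from the input query. All shapes here are illustrative assumptions rather than the library's actual types:

```typescript
// Hypothetical shapes for illustration.
type Run = { input: { query: string }; output: { text: string } };
type Results = { preprocessStepResult?: { sentences: string[] } };

// Count how many preprocessed sentences mention a term from the query.
function analyzeFn({ run, results }: { run: Run; results: Results }) {
  const terms = run.input.query.toLowerCase().split(/\s+/);
  const sentences = results.preprocessStepResult?.sentences ?? [];
  const relevant = sentences.filter((s) =>
    terms.some((t) => t.length > 0 && s.toLowerCase().includes(t)),
  );
  return { relevant: relevant.length, total: sentences.length };
}
```

The returned counts would flow into `generateScore` as `results.analyzeStepResult`.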

**Prompt Object Mode:**

<PropertiesTable
  content={[
    {
      name: "description",
      type: "string",
      isOptional: false,
      description: "Description of what this analysis step does.",
    },
    {
      name: "outputSchema",
      type: "ZodSchema",
      isOptional: false,
      description: "Zod schema for the expected output of the analyze step.",
    },
    {
      name: "createPrompt",
      type: "function",
      isOptional: false,
      description:
        "Function: ({ run, results }) => string. Returns the prompt for the LLM.",
    },
    {
      name: "judge",
      type: "object",
      isOptional: true,
      description:
        "(Optional) LLM judge for this step (can override main judge). See Judge Object section.",
    },
  ]}
/>

### generateScore

**Required** step that computes the final numerical score.

**Function Mode:**
Function: `({ run, results }) => number`

<PropertiesTable
  content={[
    {
      name: "run.input",
      type: "any",
      isOptional: false,
      description:
        "Input records provided to the scorer. If the scorer is added to an agent, this will be an array of user messages, e.g. `[{ role: 'user', content: 'hello world' }]`. If the scorer is used in a workflow, this will be the input of the workflow.",
    },
    {
      name: "run.output",
      type: "any",
      isOptional: false,
      description:
        "Output record provided to the scorer. For agents, this is usually the agent's response. For workflows, this is the workflow's output.",
    },
    {
      name: "run.runId",
      type: "string",
      isOptional: false,
      description: "Unique identifier for this scoring run.",
    },
    {
      name: "run.requestContext",
      type: "object",
      isOptional: true,
      description:
        "Request Context from the agent or workflow step being evaluated (optional).",
    },
    {
      name: "results.preprocessStepResult",
      type: "any",
      isOptional: true,
      description: "Result from preprocess step, if defined (optional).",
    },
    {
      name: "results.analyzeStepResult",
      type: "any",
      isOptional: true,
      description: "Result from analyze step, if defined (optional).",
    },
  ]}
/>

Returns: `number`  
The method must return a numerical score.

**Prompt Object Mode:**

<PropertiesTable
  content={[
    {
      name: "description",
      type: "string",
      isOptional: false,
      description: "Description of what this scoring step does.",
    },
    {
      name: "outputSchema",
      type: "ZodSchema",
      isOptional: false,
      description:
        "Zod schema for the expected output of the generateScore step.",
    },
    {
      name: "createPrompt",
      type: "function",
      isOptional: false,
      description:
        "Function: ({ run, results }) => string. Returns the prompt for the LLM.",
    },
    {
      name: "judge",
      type: "object",
      isOptional: true,
      description:
        "(Optional) LLM judge for this step (can override main judge). See Judge Object section.",
    },
  ]}
/>

When using prompt object mode, you must also provide a `calculateScore` function to convert the LLM output to a numerical score:

<PropertiesTable
  content={[
    {
      name: "calculateScore",
      type: "function",
      isOptional: false,
      description:
        "Function: ({ run, results, analyzeStepResult }) => number. Converts the LLM's structured output into a numerical score.",
    },
  ]}
/>
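For instance, if the scoring step's schema asks the judge for per-claim verdicts, a `calculateScore` function could map them to a 0–1 ratio. The verdict shape is an assumption for illustration:

```typescript
// Hypothetical structured output shape returned by the judge LLM.
type AnalyzeResult = { verdicts: { verdict: "yes" | "no" }[] };

// Convert the judge's structured verdicts into a numerical score:
// the fraction of claims judged "yes".
function calculateScore({
  analyzeStepResult,
}: {
  analyzeStepResult: AnalyzeResult;
}): number {
  const { verdicts } = analyzeStepResult;
  if (verdicts.length === 0) return 0;
  const positive = verdicts.filter((v) => v.verdict === "yes").length;
  return positive / verdicts.length;
}
```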

### generateReason

Optional step that provides an explanation for the score.

**Function Mode:**
Function: `({ run, results, score }) => string`

<PropertiesTable
  content={[
    {
      name: "run.input",
      type: "any",
      isOptional: false,
      description:
        "Input records provided to the scorer. If the scorer is added to an agent, this will be an array of user messages, e.g. `[{ role: 'user', content: 'hello world' }]`. If the scorer is used in a workflow, this will be the input of the workflow.",
    },
    {
      name: "run.output",
      type: "any",
      isOptional: false,
      description:
        "Output record provided to the scorer. For agents, this is usually the agent's response. For workflows, this is the workflow's output.",
    },
    {
      name: "run.runId",
      type: "string",
      isOptional: false,
      description: "Unique identifier for this scoring run.",
    },
    {
      name: "run.requestContext",
      type: "object",
      isOptional: true,
      description:
        "Request Context from the agent or workflow step being evaluated (optional).",
    },
    {
      name: "results.preprocessStepResult",
      type: "any",
      isOptional: true,
      description: "Result from preprocess step, if defined (optional).",
    },
    {
      name: "results.analyzeStepResult",
      type: "any",
      isOptional: true,
      description: "Result from analyze step, if defined (optional).",
    },
    {
      name: "score",
      type: "number",
      isOptional: false,
      description: "Score computed by the generateScore step.",
    },
  ]}
/>

Returns: `string`  
The method must return a string explaining the score.
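As a sketch, a function-mode generateReason step can assemble its explanation from the score and earlier step results. The `Results` shape is an illustrative assumption:

```typescript
// Hypothetical shape of earlier step results.
type Results = { analyzeStepResult?: { relevant: number; total: number } };

// Explain the score in terms of the analysis that produced it.
function generateReasonFn({
  results,
  score,
}: {
  results: Results;
  score: number;
}): string {
  const a = results.analyzeStepResult;
  if (!a) return `Score ${score}: no analysis available.`;
  return `Score ${score}: ${a.relevant} of ${a.total} sentences were relevant to the query.`;
}
```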

**Prompt Object Mode:**

<PropertiesTable
  content={[
    {
      name: "description",
      type: "string",
      isOptional: false,
      description: "Description of what this reasoning step does.",
    },
    {
      name: "createPrompt",
      type: "function",
      isOptional: false,
      description:
        "Function: ({ run, results, score }) => string. Returns the prompt for the LLM.",
    },
    {
      name: "judge",
      type: "object",
      isOptional: true,
      description:
        "(Optional) LLM judge for this step (can override main judge). See Judge Object section.",
    },
  ]}
/>

All step functions can be async.
